BINNED_STATISTIC_2D

Overview

The BINNED_STATISTIC_2D function computes a bidimensional binned statistic for data points distributed across a two-dimensional space. This function is a generalization of a 2D histogram: while a histogram counts the number of points falling into each bin, BINNED_STATISTIC_2D allows computation of various statistics—including mean, sum, median, count, standard deviation, min, and max—for the values associated with points within each bin.

This implementation wraps the scipy.stats.binned_statistic_2d function from the SciPy scientific computing library. For related functionality, see also numpy.histogram2d for standard 2D histograms and scipy.stats.binned_statistic for the one-dimensional version.

The function divides the x-y plane into a grid of rectangular bins based on the specified number of bins and optional range limits. Each data point (x_i, y_i) is assigned to a bin, and the corresponding value v_i contributes to the statistic calculated for that bin. For example, when using the mean statistic, the function computes:

\bar{v}_{\text{bin}} = \frac{1}{n_{\text{bin}}} \sum_{i \in \text{bin}} v_i

where n_{\text{bin}} is the number of points in the bin. Empty bins are represented as blank cells in the output.

The bin edges follow a half-open interval convention: for edges [e₁, e₂, e₃], the first bin includes values in [e₁, e₂) and the second bin includes [e₂, e₃]. The rightmost bin is closed on both ends to include the maximum value. Points falling outside the specified range (when xlowerlimit, xupperlimit, ylowerlimit, and yupperlimit are provided) are excluded from the computation.

Common applications include spatial data analysis, creating heatmaps of aggregated values, analyzing geographic distributions, and summarizing sensor data across a 2D field.

This example function is provided as-is without any representation of accuracy.

Excel Usage

=BINNED_STATISTIC_2D(x, y, values, bstwod_statistic, bins, xlowerlimit, xupperlimit, ylowerlimit, yupperlimit)
  • x (list[list], required): Data to bin along the first dimension (x-axis).
  • y (list[list], required): Data to bin along the second dimension (y-axis).
  • values (list[list], required): Data on which to compute the statistic.
  • bstwod_statistic (str, optional, default: “mean”): The statistic to compute for each bin.
  • bins (int, optional, default: 10): Number of bins for both dimensions.
  • xlowerlimit (float, optional, default: null): Lower bound for the x bin range.
  • xupperlimit (float, optional, default: null): Upper bound for the x bin range.
  • ylowerlimit (float, optional, default: null): Lower bound for the y bin range.
  • yupperlimit (float, optional, default: null): Upper bound for the y bin range.

Returns (list[list]): 2D list of bin statistics, or error message string.

Examples

Example 1: Count statistic in 2x2 bins

Inputs:

x y values bstwod_statistic bins xlowerlimit xupperlimit ylowerlimit yupperlimit
0.1 2.1 1 count 2 0 1 2 3
0.1 2.6 1
0.1 2.1 1
0.6 2.1 1

Excel formula:

=BINNED_STATISTIC_2D({0.1;0.1;0.1;0.6}, {2.1;2.6;2.1;2.1}, {1;1;1;1}, "count", 2, 0, 1, 2, 3)

Expected output:

Result
2 1
1 0

Example 2: Mean statistic in 2x2 bins

Inputs:

x y values bstwod_statistic bins xlowerlimit xupperlimit ylowerlimit yupperlimit
0.1 2.1 10 mean 2 0 1 2 3
0.1 2.6 20
0.1 2.1 30
0.6 2.1 40

Excel formula:

=BINNED_STATISTIC_2D({0.1;0.1;0.1;0.6}, {2.1;2.6;2.1;2.1}, {10;20;30;40}, "mean", 2, 0, 1, 2, 3)

Expected output:

Result
20 20
40

Example 3: Sum statistic in 2x2 bins

Inputs:

x y values bstwod_statistic bins xlowerlimit xupperlimit ylowerlimit yupperlimit
0.1 2.1 1 sum 2 0 1 2 3
0.1 2.6 2
0.1 2.1 3
0.6 2.1 4

Excel formula:

=BINNED_STATISTIC_2D({0.1;0.1;0.1;0.6}, {2.1;2.6;2.1;2.1}, {1;2;3;4}, "sum", 2, 0, 1, 2, 3)

Expected output:

Result
4 2
4 0

Example 4: Median statistic in 2x2 bins

Inputs:

x y values bstwod_statistic bins xlowerlimit xupperlimit ylowerlimit yupperlimit
0.1 2.1 10 median 2 0 1 2 3
0.1 2.6 20
0.1 2.1 30
0.6 2.1 40

Excel formula:

=BINNED_STATISTIC_2D({0.1;0.1;0.1;0.6}, {2.1;2.6;2.1;2.1}, {10;20;30;40}, "median", 2, 0, 1, 2, 3)

Expected output:

Result
20 20
40

Python Code

import math
from scipy.stats import binned_statistic_2d as scipy_binned_statistic_2d

def binned_statistic_2d(x, y, values, bstwod_statistic='mean', bins=10, xlowerlimit=None, xupperlimit=None, ylowerlimit=None, yupperlimit=None):
    """
    Computes a bidimensional binned statistic (mean, sum, median, etc.) for the input data.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic_2d.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Data to bin along the first dimension (x-axis).
        y (list[list]): Data to bin along the second dimension (y-axis).
        values (list[list]): Data on which to compute the statistic.
        bstwod_statistic (str, optional): The statistic to compute for each bin. Valid options: Mean, Sum, Median, Count, Standard Deviation, Min, Max. Default is 'mean'.
        bins (int, optional): Number of bins for both dimensions. Default is 10.
        xlowerlimit (float, optional): Lower bound for the x bin range. Default is None.
        xupperlimit (float, optional): Upper bound for the x bin range. Default is None.
        ylowerlimit (float, optional): Lower bound for the y bin range. Default is None.
        yupperlimit (float, optional): Upper bound for the y bin range. Default is None.

    Returns:
        list[list]: 2D list of bin statistics, or error message string.
    """
    # Helper function to normalize 2D list inputs
    def to2d(val):
        return [[val]] if not isinstance(val, list) else val

    # Convert scalars to 2D lists if needed
    x = to2d(x)
    y = to2d(y)
    values = to2d(values)
    # Flatten 2D lists to 1D
    try:
        flat_x = [float(item) for row in x for item in (row if isinstance(row, list) else [row])]
        flat_y = [float(item) for row in y for item in (row if isinstance(row, list) else [row])]
        flat_values = [float(item) for row in values for item in (row if isinstance(row, list) else [row])]
    except Exception:
        return "Invalid input: x, y, and values must be 2D lists or scalars of numbers."
    if len(flat_x) == 0 or len(flat_y) == 0 or len(flat_values) == 0:
        return "Invalid input: x, y, and values must not be empty."
    if len(flat_x) != len(flat_y) or len(flat_x) != len(flat_values):
        return "Invalid input: x, y, and values must have the same length."
    if bstwod_statistic not in ('mean', 'sum', 'median', 'count', 'std', 'min', 'max'):
        return "Invalid input: bstwod_statistic must be one of 'mean', 'sum', 'median', 'count', 'std', 'min', 'max'."
    try:
        nbins = int(bins)
    except Exception:
        return "Invalid input: bins must be an integer."
    if nbins < 1:
        return "Invalid input: bins must be >= 1."
    # Prepare range
    range_arg = None
    if (xlowerlimit is not None and xupperlimit is not None and
        ylowerlimit is not None and yupperlimit is not None):
        try:
            range_arg = [(float(xlowerlimit), float(xupperlimit)), (float(ylowerlimit), float(yupperlimit))]
        except Exception:
            return "Invalid input: xlowerlimit, xupperlimit, ylowerlimit, yupperlimit must be numbers."
    try:
        if range_arg:
            res = scipy_binned_statistic_2d(flat_x, flat_y, flat_values, statistic=bstwod_statistic, bins=nbins, range=range_arg)
        else:
            res = scipy_binned_statistic_2d(flat_x, flat_y, flat_values, statistic=bstwod_statistic, bins=nbins)
        stat = res.statistic
    except Exception as e:
        return f"scipy.stats.binned_statistic_2d error: {e}"
    # Return as 2D list (row-major), converting NaN to empty string for empty bins
    result = []
    for row in stat.tolist():
        result_row = []
        for val in row:
            if val is None or (isinstance(val, float) and math.isnan(val)):
                result_row.append("")
            else:
                result_row.append(float(val))
        result.append(result_row)
    return result

Online Calculator